Lenient Learning in Independent-Learner Stochastic Cooperative Games
نویسندگان
چکیده
We introduce the Lenient Multiagent Reinforcement Learning 2 (LMRL2) algorithm for independent-learner stochastic cooperative games. LMRL2 is designed to overcome a pathology called relative overgeneralization, and to do so while still performing well in games with stochastic transitions, stochastic rewards, and miscoordination. We discuss the existing literature, then compare LMRL2 against other algorithms drawn from the literature which can be used for games of this kind: traditional (“Distributed”) Q-learning, Hysteretic Q-learning, WoLF-PHC, SOoN, and (for repeated games only) FMQ. The results show that LMRL2 is very effective in both of our measures (complete and correct policies), and is found in the top rank more often than any other technique. LMRL2 is also easy to tune: though it has many available parameters, almost all of them stay at default settings. Generally the algorithm is optimally tuned with a single parameter, if any. We then examine and discuss a number of side-issues and options for LMRL2.
منابع مشابه
Reinforcement Learning in Multi-agent Games
This article investigates the performance of independent reinforcement learners in multiagent games. Convergence to Nash equilibria and parameter settings for desired learning behavior are discussed for Q-learning, Frequency Maximum Q value (FMQ) learning and lenient Q-learning. FMQ and lenient Q-learning are shown to outperform regular Q-learning significantly in the context of coordination ga...
متن کاملSolving a Two-Period Cooperative Advertising Problem Using Dynamic Programming
Cooperative advertising is a cost-sharing mechanism in which a part of retailers' advertising investments are financed by the manufacturers. In recent years, investment among advertising options has become a difficult marketing issue. In this paper, the cooperative advertising problem with advertising options is investigated in a two-period horizon in which the market share in the second period...
متن کاملEmpirical and theoretical support for lenient learning
Recently, an evolutionary model of Lenient Q-learning (LQ) has been proposed, providing theoretical guarantees of convergence to the global optimum in cooperative multi-agent learning. However, experiments reveal discrepancies between the predicted dynamics of the evolutionary model and the actual learning behavior of the Lenient Q-learning algorithm, which undermines its theoretical foundation...
متن کاملReinforcement social learning of coordination in cooperative multiagent systems
Coordination in cooperative multiagent systems is an important problem and has received a lot of attention in multiagent learning literature. Most of previous works study the problem of how two (or more) players can coordinate on Pareto-optimal Nash equilibrium(s) through fixed and repeated interactions in the context of cooperative games. However, in practical complex environments, the interac...
متن کاملThe Impact of Fostering Learner Autonomy through Implementing Cooperative Learning Strategies on Inferential Reading Comprehension Ability of Iranian EFL Learners
Abstract The great shift of paradigm from teacher-centeredness to learner-centeredness has one major rationale in line with the definitions of autonomy, i.e., the capacity and willingness to act independently and in cooperation with others, so cooperation is looked upon as the manifestation of autonomy. In the present study, the researchers investigated the impact of training cooperative...
متن کاملذخیره در منابع من
با ذخیره ی این منبع در منابع من، دسترسی به آن را برای استفاده های بعدی آسان تر کنید
عنوان ژورنال:
- Journal of Machine Learning Research
دوره 17 شماره
صفحات -
تاریخ انتشار 2016